Overview

Dataset statistics

Number of variables: 31
Number of observations: 101766
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): < 0.1%
Total size in memory: 46.0 MiB
Average record size in memory: 474.3 B
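
The overview figures above are simple counts over the table. A minimal sketch of how such counts could be derived, using a small hypothetical table rather than the profiled dataset; the exact definition of "duplicate rows" used by the profiler is an assumption here (distinct rows that occur more than once):

```python
from collections import Counter

# Hypothetical toy table: each row is a tuple, None marks a missing cell.
rows = [
    ("Caucasian", 7, 0),
    ("Caucasian", 7, 0),    # exact duplicate of the row above
    ("Hispanic", 5, None),  # one missing cell
    ("Other", 8, 1),
]

n_obs = len(rows)
n_vars = len(rows[0])

# Missing cells are counted over every cell in the table.
missing_cells = sum(cell is None for row in rows for cell in row)
missing_pct = 100 * missing_cells / (n_obs * n_vars)

# Assumption: a "duplicate row" is a distinct row that occurs more than once.
row_counts = Counter(rows)
duplicate_rows = sum(1 for count in row_counts.values() if count > 1)

print(n_vars, n_obs, missing_cells, f"{missing_pct:.1f}%", duplicate_rows)
```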

Variable types

Categorical: 18
Numeric: 13

Alerts

Dataset has 4 (< 0.1%) duplicate rows [Duplicates]
age is highly overall correlated with age_sq [High correlation]
admission_source is highly overall correlated with admission_type [High correlation]
age_sq is highly overall correlated with age [High correlation]
invisits is highly overall correlated with sum_visits [High correlation]
admission_type is highly overall correlated with admission_source [High correlation]
change_meds is highly overall correlated with diabetes_meds and 2 other fields [High correlation]
diabetes_meds is highly overall correlated with change_meds and 2 other fields [High correlation]
insulin is highly overall correlated with change_meds and 2 other fields [High correlation]
no_meds is highly overall correlated with change_meds and 2 other fields [High correlation]
sum_visits is highly overall correlated with invisits [High correlation]
race is highly imbalanced (52.2%) [Imbalance]
glucose_test_result is highly imbalanced (81.2%) [Imbalance]
glimepiride is highly imbalanced (70.9%) [Imbalance]
glyburide is highly imbalanced (51.6%) [Imbalance]
pioglitazone is highly imbalanced (62.7%) [Imbalance]
repaglinide is highly imbalanced (88.7%) [Imbalance]
rosiglitazone is highly imbalanced (66.3%) [Imbalance]
payer_code has 4655 (4.6%) zeros [Zeros]
non_lab_procedures has 46652 (45.8%) zeros [Zeros]
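
The age / age_sq alert is expected whenever one column is a monotonic transform of another. The report does not say which correlation measure triggered it, so the sketch below assumes a rank (Spearman) correlation and uses the distinct age codes 0 to 10 as illustrative input; the helper names are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes there are no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


age = list(range(11))          # the distinct age codes 0..10
age_sq = [a * a for a in age]  # squaring preserves order for non-negative values
rho = spearman(age, age_sq)
print(rho)
```

Because squaring keeps the ordering of non-negative values, the ranks of the two columns coincide and the rank correlation is 1 up to floating-point error. In a model, keeping both columns adds a nonlinear term but no new ordering information.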

Reproduction

Analysis started: 2023-11-13 14:44:47.714638
Analysis finished: 2023-11-13 14:45:21.831520
Duration: 34.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
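
The reported duration follows directly from the two timestamps. A small check using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

# Timestamp layout as it appears in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-11-13 14:44:47.714638", FMT)
finished = datetime.strptime("2023-11-13 14:45:21.831520", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```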

Variables

race
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 MiB
Caucasian        74146
AfricanAmerican  18720
Not Registered    4839
Hispanic          1982
Other             1458

Length

Max length: 15
Median length: 9
Mean length: 10.240267
Min length: 5
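
The length statistics can be reproduced from the value counts of race alone. A minimal sketch, assuming the six category counts listed under Common Values; `weighted_median` is a hypothetical helper that returns the lower median of a count table:

```python
race_counts = {
    "Caucasian": 74146,
    "AfricanAmerican": 18720,
    "Not Registered": 4839,
    "Hispanic": 1982,
    "Other": 1458,
    "Asian": 621,
}

# Aggregate the row counts by string length.
length_counts = {}
for value, count in race_counts.items():
    length_counts[len(value)] = length_counts.get(len(value), 0) + count

n = sum(length_counts.values())
mean_length = sum(length * count for length, count in length_counts.items()) / n


def weighted_median(counts):
    """Lower median of a {value: count} table."""
    half = sum(counts.values()) / 2
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= half:
            return value


median_length = weighted_median(length_counts)
print(max(length_counts), median_length, round(mean_length, 6), min(length_counts))
```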

Characters and Unicode

Total characters: 1042111
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Caucasian
2nd row: Caucasian
3rd row: Caucasian
4th row: Not Registered
5th row: Caucasian

Common Values

Value            Count   Frequency (%)
Caucasian        74146   72.9%
AfricanAmerican  18720   18.4%
Not Registered    4839    4.8%
Hispanic          1982    1.9%
Other             1458    1.4%
Asian              621    0.6%
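
The 52.2% imbalance figure reported for race is consistent with a score of one minus the normalized Shannon entropy of the category frequencies. Whether the profiler uses exactly this formula is an assumption; the sketch below only shows that the formula reproduces the reported value from the counts above.

```python
import math

# Category counts for race, taken from the Common Values table.
race_counts = [74146, 18720, 4839, 1982, 1458, 621]


def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform split, approaching 1 for one dominant class."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


score = imbalance_score(race_counts)
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0, so the 72.9% share of the largest category is what drives the alert here.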

Length

Histogram of lengths of the category

Common Values (Plot)

Value            Count   Frequency (%)
caucasian        74146   69.6%
africanamerican  18720   17.6%
not               4839    4.5%
registered        4839    4.5%
hispanic          1982    1.9%
other             1458    1.4%
asian              621    0.6%
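
The table above counts lowercase tokens rather than whole values, which is why "Not Registered" appears as two rows and the percentages differ from Common Values. A sketch of that tokenization, assuming lowercasing and whitespace splitting (the profiler's actual tokenizer is not shown in the report):

```python
from collections import Counter

race_counts = {
    "Caucasian": 74146,
    "AfricanAmerican": 18720,
    "Not Registered": 4839,
    "Hispanic": 1982,
    "Other": 1458,
    "Asian": 621,
}

# Split each value into lowercase words and weight each word by the row count.
word_counts = Counter()
for value, count in race_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:<16}{count:>7}  {count / total_words:.1%}")
```

The denominator is the number of words (101766 rows plus 4839 extra tokens from the two-word value), not the number of rows.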

Most occurring characters

Value              Count    Frequency (%)
a                  262481   25.2%
i                  121010   11.6%
n                  114189   11.0%
c                  113568   10.9%
s                   81588    7.8%
C                   74146    7.1%
u                   74146    7.1%
r                   43737    4.2%
A                   38061    3.7%
e                   34695    3.3%
Other values (13)   84490    8.1%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  911947   87.5%
Uppercase Letter  125325   12.0%
Space Separator     4839    0.5%
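
The category totals above can be recomputed with Python's `unicodedata` module, which exposes the Unicode general category of each character (`Ll` lowercase letter, `Lu` uppercase letter, `Zs` space separator). This is a sketch over the race value counts, not the profiler's own code:

```python
import unicodedata
from collections import Counter

race_counts = {
    "Caucasian": 74146,
    "AfricanAmerican": 18720,
    "Not Registered": 4839,
    "Hispanic": 1982,
    "Other": 1458,
    "Asian": 621,
}

# Tally the general category of every character, weighted by the row count.
category_counts = Counter()
for value, count in race_counts.items():
    for ch in value:
        category_counts[unicodedata.category(ch)] += count

total_chars = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(f"{category}  {count:>7}  {count / total_chars:.1%}")
```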

Most frequent character per category

Lowercase Letter
Value             Count    Frequency (%)
a                 262481   28.8%
i                 121010   13.3%
n                 114189   12.5%
c                 113568   12.5%
s                  81588    8.9%
u                  74146    8.1%
r                  43737    4.8%
e                  34695    3.8%
m                  18720    2.1%
f                  18720    2.1%
Other values (6)   29093    3.2%

Uppercase Letter
Value  Count   Frequency (%)
C      74146   59.2%
A      38061   30.4%
N       4839    3.9%
R       4839    3.9%
H       1982    1.6%
O       1458    1.2%

Space Separator
Value    Count   Frequency (%)
(space)  4839    100.0%

Most occurring scripts

Value   Count     Frequency (%)
Latin   1037272   99.5%
Common     4839    0.5%

Most frequent character per script

Latin
Value              Count    Frequency (%)
a                  262481   25.3%
i                  121010   11.7%
n                  114189   11.0%
c                  113568   10.9%
s                   81588    7.9%
C                   74146    7.1%
u                   74146    7.1%
r                   43737    4.2%
A                   38061    3.7%
e                   34695    3.3%
Other values (12)   79651    7.7%

Common
Value    Count   Frequency (%)
(space)  4839    100.0%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  1042111   100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
a                  262481   25.2%
i                  121010   11.6%
n                  114189   11.0%
c                  113568   10.9%
s                   81588    7.8%
C                   74146    7.1%
u                   74146    7.1%
r                   43737    4.2%
A                   38061    3.7%
e                   34695    3.3%
Other values (13)   84490    8.1%

age
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.931028
Minimum: 0
Maximum: 10
Zeros: 158
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 6
Median: 7
Q3: 8
95-th percentile: 9
Maximum: 10
Range: 10
Interquartile range (IQR): 2
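
For a variable with this few distinct values, the quantiles can be read off the cumulative counts. The sketch below uses the age value counts from this report and a hypothetical `quantile` helper that returns the smallest value whose cumulative count reaches the requested fraction; it does not interpolate, which is an assumption that happens to agree with the reported figures here.

```python
# Value counts for age, taken from the frequency tables in this section.
age_counts = {
    0: 158, 1: 2757, 2: 668, 3: 1609, 4: 3677, 5: 9407,
    6: 16780, 7: 21885, 8: 25334, 9: 16777, 10: 2714,
}


def quantile(counts, q):
    """Smallest value whose cumulative count reaches q * n (no interpolation)."""
    target = q * sum(counts.values())
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= target:
            return value
    return max(counts)


p5, q1, median, q3, p95 = (quantile(age_counts, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
value_range = max(age_counts) - min(age_counts)
print(p5, q1, median, q3, p95, iqr, value_range)
```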

Descriptive statistics

Standard deviation: 1.8633167
Coefficient of variation (CV): 0.26883698
Kurtosis: 1.5255955
Mean: 6.931028
Median Absolute Deviation (MAD): 1
Skewness: -1.0959809
Sum: 705343
Variance: 3.471949
Monotonicity: Not monotonic
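
Several of the descriptive statistics are related by simple identities: the mean is Sum divided by the number of observations, the standard deviation is the square root of the variance, and the coefficient of variation is the standard deviation divided by the mean. A sketch that recomputes them from the age value counts, assuming the sample (n - 1) variance:

```python
import math

# Value counts for age, taken from the frequency tables in this section.
age_counts = {
    0: 158, 1: 2757, 2: 668, 3: 1609, 4: 3677, 5: 9407,
    6: 16780, 7: 21885, 8: 25334, 9: 16777, 10: 2714,
}

n = sum(age_counts.values())
total = sum(value * count for value, count in age_counts.items())
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
variance = sum(count * (value - mean) ** 2 for value, count in age_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(total, round(mean, 6), round(variance, 6), round(std, 7), round(cv, 8))
```

With more than 100,000 rows the choice between n and n - 1 changes the variance only in the fifth decimal place.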
Histogram with fixed size bins (bins=11)
Common values
Value  Count   Frequency (%)
8      25334   24.9%
7      21885   21.5%
6      16780   16.5%
9      16777   16.5%
5       9407    9.2%
4       3677    3.6%
1       2757    2.7%
10      2714    2.7%
3       1609    1.6%
2        668    0.7%

Minimum 10 values
Value  Count   Frequency (%)
0        158    0.2%
1       2757    2.7%
2        668    0.7%
3       1609    1.6%
4       3677    3.6%
5       9407    9.2%
6      16780   16.5%
7      21885   21.5%
8      25334   24.9%
9      16777   16.5%

Maximum 10 values
Value  Count   Frequency (%)
10      2714    2.7%
9      16777   16.5%
8      25334   24.9%
7      21885   21.5%
6      16780   16.5%
5       9407    9.2%
4       3677    3.6%
3       1609    1.6%
2        668    0.7%
1       2757    2.7%

payer_code
Real number (ℝ)

ZEROS 

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.354303
Minimum: -1
Maximum: 99
Zeros: 4655
Zeros (%): 4.6%
Negative: 40256
Negative (%): 39.6%
Memory size: 5.6 MiB
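
The zero and negative shares for payer_code follow from its value counts. The sketch below recomputes them; the counts are taken from the frequency tables in this section. That -1 and 99 act as placeholder codes in an integer-encoded categorical is an assumption, not something the report states, but if it holds, the mean and standard deviation of this column have no direct interpretation.

```python
# Value counts for payer_code, combined from the frequency tables in this section.
payer_code_counts = {
    -1: 40256, 0: 4655, 2: 1937, 3: 2533, 6: 6274, 7: 32439,
    8: 3532, 10: 1033, 14: 5007, 15: 2448, 99: 1652,
}

n = sum(payer_code_counts.values())
zeros = payer_code_counts.get(0, 0)
negative = sum(count for value, count in payer_code_counts.items() if value < 0)

print(f"zeros: {zeros} ({zeros / n:.1%})")
print(f"negative: {negative} ({negative / n:.1%})")
```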

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median6
Q37
95-th percentile14
Maximum99
Range100
Interquartile range (IQR)8

Descriptive statistics

Standard deviation12.932999
Coefficient of variation (CV)2.4154402
Kurtosis41.668381
Mean5.354303
Median Absolute Deviation (MAD)6
Skewness6.1383757
Sum544886
Variance167.26245
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
-1 40256
39.6%
7 32439
31.9%
6 6274
 
6.2%
14 5007
 
4.9%
0 4655
 
4.6%
8 3532
 
3.5%
3 2533
 
2.5%
15 2448
 
2.4%
2 1937
 
1.9%
99 1652
 
1.6%
ValueCountFrequency (%)
-1 40256
39.6%
0 4655
 
4.6%
2 1937
 
1.9%
3 2533
 
2.5%
6 6274
 
6.2%
7 32439
31.9%
8 3532
 
3.5%
10 1033
 
1.0%
14 5007
 
4.9%
15 2448
 
2.4%
ValueCountFrequency (%)
99 1652
 
1.6%
15 2448
 
2.4%
14 5007
 
4.9%
10 1033
 
1.0%
8 3532
 
3.5%
7 32439
31.9%
6 6274
 
6.2%
3 2533
 
2.5%
2 1937
 
1.9%
0 4655
 
4.6%

invisits
Categorical

HIGH CORRELATION 

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
67630 
1
19521 
2
7566 
4
 
3638
3
 
3411

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 67630
66.5%
1 19521
 
19.2%
2 7566
 
7.4%
4 3638
 
3.6%
3 3411
 
3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 67630
66.5%
1 19521
 
19.2%
2 7566
 
7.4%
4 3638
 
3.6%
3 3411
 
3.4%

Most occurring characters

ValueCountFrequency (%)
0 67630
66.5%
1 19521
 
19.2%
2 7566
 
7.4%
4 3638
 
3.6%
3 3411
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 67630
66.5%
1 19521
 
19.2%
2 7566
 
7.4%
4 3638
 
3.6%
3 3411
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 67630
66.5%
1 19521
 
19.2%
2 7566
 
7.4%
4 3638
 
3.6%
3 3411
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 67630
66.5%
1 19521
 
19.2%
2 7566
 
7.4%
4 3638
 
3.6%
3 3411
 
3.4%

admission_type
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
1
53990 
0
18900 
6
18480 
3
10396 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row3
3rd row6
4th row6
5th row1

Common Values

ValueCountFrequency (%)
1 53990
53.1%
0 18900
 
18.6%
6 18480
 
18.2%
3 10396
 
10.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 53990
53.1%
0 18900
 
18.6%
6 18480
 
18.2%
3 10396
 
10.2%

Most occurring characters

ValueCountFrequency (%)
1 53990
53.1%
0 18900
 
18.6%
6 18480
 
18.2%
3 10396
 
10.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 53990
53.1%
0 18900
 
18.6%
6 18480
 
18.2%
3 10396
 
10.2%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 53990
53.1%
0 18900
 
18.6%
6 18480
 
18.2%
3 10396
 
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 53990
53.1%
0 18900
 
18.6%
6 18480
 
18.2%
3 10396
 
10.2%

disposition
Real number (ℝ)

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.2777155
Minimum1
Maximum23
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q311
95-th percentile21
Maximum23
Range22
Interquartile range (IQR)10

Descriptive statistics

Standard deviation7.4155546
Coefficient of variation (CV)1.1812505
Kurtosis-0.29985291
Mean6.2777155
Median Absolute Deviation (MAD)0
Skewness1.0865706
Sum638858
Variance54.990451
MonotonicityNot monotonic
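
A median absolute deviation of 0 can look like an error next to a standard deviation above 7, but it follows from more than half of the rows sharing the median value. A sketch over the disposition value counts from this section; `weighted_median` is a hypothetical helper returning the lower median of a count table:

```python
from collections import Counter

# Value counts for disposition, taken from the frequency tables in this section.
disposition_counts = {
    1: 60234, 4: 2090, 5: 412, 6: 2176, 7: 2216,
    9: 1184, 11: 17516, 20: 372, 21: 14577, 23: 989,
}


def weighted_median(counts):
    """Lower median of a {value: count} table."""
    half = sum(counts.values()) / 2
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= half:
            return value


median = weighted_median(disposition_counts)

# Distribution of absolute deviations from the median, with the same weights.
abs_dev_counts = Counter()
for value, count in disposition_counts.items():
    abs_dev_counts[abs(value - median)] += count

mad = weighted_median(abs_dev_counts)
print(median, mad)
```

Since 59.2% of rows have the value 1, the median is 1 and at least half of the absolute deviations are 0, so the MAD is 0 regardless of how spread out the remaining values are.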
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
1 60234
59.2%
11 17516
 
17.2%
21 14577
 
14.3%
7 2216
 
2.2%
6 2176
 
2.1%
4 2090
 
2.1%
9 1184
 
1.2%
23 989
 
1.0%
5 412
 
0.4%
20 372
 
0.4%
ValueCountFrequency (%)
1 60234
59.2%
4 2090
 
2.1%
5 412
 
0.4%
6 2176
 
2.1%
7 2216
 
2.2%
9 1184
 
1.2%
11 17516
 
17.2%
20 372
 
0.4%
21 14577
 
14.3%
23 989
 
1.0%
ValueCountFrequency (%)
23 989
 
1.0%
21 14577
 
14.3%
20 372
 
0.4%
11 17516
 
17.2%
9 1184
 
1.2%
7 2216
 
2.2%
6 2176
 
2.1%
5 412
 
0.4%
4 2090
 
2.1%
1 60234
59.2%

admission_source
Real number (ℝ)

HIGH CORRELATION 

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.2910501
Minimum1
Maximum15
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q312
95-th percentile15
Maximum15
Range14
Interquartile range (IQR)11

Descriptive statistics

Standard deviation5.4378786
Coefficient of variation (CV)1.0277504
Kurtosis-1.5261728
Mean5.2910501
Median Absolute Deviation (MAD)0
Skewness0.59569808
Sum538449
Variance29.570524
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%)
1 57494
56.5%
12 30669
30.1%
3 6906
 
6.8%
15 5466
 
5.4%
8 867
 
0.9%
13 203
 
0.2%
4 161
 
0.2%
ValueCountFrequency (%)
1 57494
56.5%
3 6906
 
6.8%
4 161
 
0.2%
8 867
 
0.9%
12 30669
30.1%
13 203
 
0.2%
15 5466
 
5.4%
ValueCountFrequency (%)
15 5466
 
5.4%
13 203
 
0.2%
12 30669
30.1%
8 867
 
0.9%
4 161
 
0.2%
3 6906
 
6.8%
1 57494
56.5%

length_stay
Real number (ℝ)

Distinct14
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.3959869
Minimum1
Maximum14
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median4
Q36
95-th percentile11
Maximum14
Range13
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.9851078
Coefficient of variation (CV)0.67905293
Kurtosis0.85025084
Mean4.3959869
Median Absolute Deviation (MAD)2
Skewness1.1339987
Sum447362
Variance8.9108684
MonotonicityNot monotonic
Histogram with fixed size bins (bins=14)
ValueCountFrequency (%)
3 17756
17.4%
2 17224
16.9%
1 14208
14.0%
4 13924
13.7%
5 9966
9.8%
6 7539
7.4%
7 5859
 
5.8%
8 4391
 
4.3%
9 3002
 
2.9%
10 2342
 
2.3%
Other values (4) 5555
 
5.5%
ValueCountFrequency (%)
1 14208
14.0%
2 17224
16.9%
3 17756
17.4%
4 13924
13.7%
5 9966
9.8%
6 7539
7.4%
7 5859
 
5.8%
8 4391
 
4.3%
9 3002
 
2.9%
10 2342
 
2.3%
ValueCountFrequency (%)
14 1042
 
1.0%
13 1210
 
1.2%
12 1448
 
1.4%
11 1855
 
1.8%
10 2342
 
2.3%
9 3002
 
2.9%
8 4391
4.3%
7 5859
5.8%
6 7539
7.4%
5 9966
9.8%

num_tests
Real number (ℝ)

Distinct118
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean43.095641
Minimum1
Maximum132
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum1
5-th percentile4
Q131
median44
Q357
95-th percentile73
Maximum132
Range131
Interquartile range (IQR)26

Descriptive statistics

Standard deviation19.674362
Coefficient of variation (CV)0.45652789
Kurtosis-0.24507352
Mean43.095641
Median Absolute Deviation (MAD)13
Skewness-0.23654392
Sum4385671
Variance387.08053
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 3208
 
3.2%
43 2804
 
2.8%
44 2496
 
2.5%
45 2376
 
2.3%
38 2213
 
2.2%
40 2201
 
2.2%
46 2189
 
2.2%
41 2117
 
2.1%
42 2113
 
2.1%
47 2106
 
2.1%
Other values (108) 77943
76.6%
ValueCountFrequency (%)
1 3208
3.2%
2 1101
 
1.1%
3 668
 
0.7%
4 378
 
0.4%
5 286
 
0.3%
6 282
 
0.3%
7 323
 
0.3%
8 366
 
0.4%
9 933
 
0.9%
10 838
 
0.8%
ValueCountFrequency (%)
132 1
 
< 0.1%
129 1
 
< 0.1%
126 1
 
< 0.1%
121 1
 
< 0.1%
120 1
 
< 0.1%
118 1
 
< 0.1%
114 2
< 0.1%
113 3
< 0.1%
111 3
< 0.1%
109 4
< 0.1%

non_lab_procedures
Real number (ℝ)

ZEROS 

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.3397304
Minimum0
Maximum6
Zeros46652
Zeros (%)45.8%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q32
95-th percentile5
Maximum6
Range6
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.705807
Coefficient of variation (CV)1.2732465
Kurtosis0.8571103
Mean1.3397304
Median Absolute Deviation (MAD)1
Skewness1.3164148
Sum136339
Variance2.9097775
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%)
0 46652
45.8%
1 20742
20.4%
2 12717
 
12.5%
3 9443
 
9.3%
6 4954
 
4.9%
4 4180
 
4.1%
5 3078
 
3.0%
ValueCountFrequency (%)
0 46652
45.8%
1 20742
20.4%
2 12717
 
12.5%
3 9443
 
9.3%
4 4180
 
4.1%
5 3078
 
3.0%
6 4954
 
4.9%
ValueCountFrequency (%)
6 4954
 
4.9%
5 3078
 
3.0%
4 4180
 
4.1%
3 9443
 
9.3%
2 12717
 
12.5%
1 20742
20.4%
0 46652
45.8%

num_meds
Real number (ℝ)

Distinct75
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean16.021844
Minimum1
Maximum81
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum1
5-th percentile6
Q110
median15
Q320
95-th percentile31
Maximum81
Range80
Interquartile range (IQR)10

Descriptive statistics

Standard deviation8.1275662
Coefficient of variation (CV)0.50728032
Kurtosis3.4681549
Mean16.021844
Median Absolute Deviation (MAD)5
Skewness1.3266721
Sum1630479
Variance66.057332
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
13 6086
 
6.0%
12 6004
 
5.9%
11 5795
 
5.7%
15 5792
 
5.7%
14 5707
 
5.6%
16 5430
 
5.3%
10 5346
 
5.3%
17 4919
 
4.8%
9 4913
 
4.8%
18 4523
 
4.4%
Other values (65) 47251
46.4%
ValueCountFrequency (%)
1 262
 
0.3%
2 470
 
0.5%
3 900
 
0.9%
4 1417
 
1.4%
5 2017
 
2.0%
6 2699
2.7%
7 3484
3.4%
8 4353
4.3%
9 4913
4.8%
10 5346
5.3%
ValueCountFrequency (%)
81 1
 
< 0.1%
79 1
 
< 0.1%
75 2
 
< 0.1%
74 1
 
< 0.1%
72 3
< 0.1%
70 2
 
< 0.1%
69 5
< 0.1%
68 7
< 0.1%
67 7
< 0.1%
66 5
< 0.1%

diag_1
Real number (ℝ)

Distinct16
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean8.5480416
Minimum1
Maximum18
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum1
5-th percentile2
Q17
median7
Q310
95-th percentile17
Maximum18
Range17
Interquartile range (IQR)3

Descriptive statistics

Standard deviation4.4432368
Coefficient of variation (CV)0.51979589
Kurtosis-0.42835521
Mean8.5480416
Median Absolute Deviation (MAD)2
Skewness0.54847408
Sum869900
Variance19.742354
MonotonicityNot monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%)
7 30336
29.8%
3 11459
 
11.3%
8 10407
 
10.2%
9 9208
 
9.0%
16 7636
 
7.5%
17 6974
 
6.9%
10 5078
 
5.0%
13 4957
 
4.9%
2 3433
 
3.4%
1 2768
 
2.7%
Other values (6) 9510
 
9.3%
ValueCountFrequency (%)
1 2768
 
2.7%
2 3433
 
3.4%
3 11459
 
11.3%
4 1103
 
1.1%
5 2262
 
2.2%
6 1211
 
1.2%
7 30336
29.8%
8 10407
 
10.2%
9 9208
 
9.0%
10 5078
 
5.0%
ValueCountFrequency (%)
18 1666
 
1.6%
17 6974
 
6.9%
16 7636
 
7.5%
13 4957
 
4.9%
12 2530
 
2.5%
11 738
 
0.7%
10 5078
 
5.0%
9 9208
 
9.0%
8 10407
 
10.2%
7 30336
29.8%

diag_2
Real number (ℝ)

Distinct17
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.3805691
Minimum-1
Maximum18
Zeros0
Zeros (%)0.0%
Negative466
Negative (%)0.5%
Memory size5.6 MiB

Quantile statistics

Minimum-1
5-th percentile3
Q13
median7
Q39
95-th percentile16
Maximum18
Range19
Interquartile range (IQR)6

Descriptive statistics

Standard deviation4.0940042
Coefficient of variation (CV)0.55470033
Kurtosis0.43950178
Mean7.3805691
Median Absolute Deviation (MAD)3
Skewness0.84795121
Sum751091
Variance16.76087
MonotonicityNot monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
7 31365
30.8%
3 21017
20.7%
8 10251
 
10.1%
10 7987
 
7.8%
16 4632
 
4.6%
9 3962
 
3.9%
12 3596
 
3.5%
4 2926
 
2.9%
5 2657
 
2.6%
2 2547
 
2.5%
Other values (7) 10826
 
10.6%
ValueCountFrequency (%)
-1 466
 
0.5%
1 1931
 
1.9%
2 2547
 
2.5%
3 21017
20.7%
4 2926
 
2.9%
5 2657
 
2.6%
6 1286
 
1.3%
7 31365
30.8%
8 10251
 
10.1%
9 3962
 
3.9%
ValueCountFrequency (%)
18 2536
 
2.5%
17 2428
 
2.4%
16 4632
 
4.6%
13 1764
 
1.7%
12 3596
 
3.5%
11 415
 
0.4%
10 7987
 
7.8%
9 3962
 
3.9%
8 10251
 
10.1%
7 31365
30.8%

diag_3
Real number (ℝ)

Distinct17
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.201757
Minimum-1
Maximum18
Zeros0
Zeros (%)0.0%
Negative1519
Negative (%)1.5%
Memory size5.6 MiB

Quantile statistics

Minimum-1
5-th percentile2
Q13
median7
Q39
95-th percentile17
Maximum18
Range19
Interquartile range (IQR)6

Descriptive statistics

Standard deviation4.4895567
Coefficient of variation (CV)0.62339742
Kurtosis0.32121715
Mean7.201757
Median Absolute Deviation (MAD)3
Skewness0.92240615
Sum732894
Variance20.15612
MonotonicityNot monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
7 29918
29.4%
3 26308
25.9%
8 6774
 
6.7%
10 6327
 
6.2%
18 5058
 
5.0%
16 4523
 
4.4%
9 3572
 
3.5%
5 3136
 
3.1%
4 2490
 
2.4%
12 2488
 
2.4%
Other values (7) 11172
 
11.0%
ValueCountFrequency (%)
-1 1519
 
1.5%
1 1861
 
1.8%
2 1856
 
1.8%
3 26308
25.9%
4 2490
 
2.4%
5 3136
 
3.1%
6 1766
 
1.7%
7 29918
29.4%
8 6774
 
6.7%
9 3572
 
3.5%
ValueCountFrequency (%)
18 5058
 
5.0%
17 1946
 
1.9%
16 4523
 
4.4%
13 1915
 
1.9%
12 2488
 
2.4%
11 309
 
0.3%
10 6327
 
6.2%
9 3572
 
3.5%
8 6774
 
6.7%
7 29918
29.4%

num_diag
Real number (ℝ)

Distinct16
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.4226068
Minimum1
Maximum16
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum1
5-th percentile4
Q16
median8
Q39
95-th percentile9
Maximum16
Range15
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.9336001
Coefficient of variation (CV)0.26050149
Kurtosis-0.079056024
Mean7.4226068
Median Absolute Deviation (MAD)1
Skewness-0.87674624
Sum755369
Variance3.7388095
MonotonicityNot monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%)
9 49474
48.6%
5 11393
 
11.2%
8 10616
 
10.4%
7 10393
 
10.2%
6 10161
 
10.0%
4 5537
 
5.4%
3 2835
 
2.8%
2 1023
 
1.0%
1 219
 
0.2%
16 45
 
< 0.1%
Other values (6) 70
 
0.1%
ValueCountFrequency (%)
1 219
 
0.2%
2 1023
 
1.0%
3 2835
 
2.8%
4 5537
 
5.4%
5 11393
 
11.2%
6 10161
 
10.0%
7 10393
 
10.2%
8 10616
 
10.4%
9 49474
48.6%
10 17
 
< 0.1%
ValueCountFrequency (%)
16 45
 
< 0.1%
15 10
 
< 0.1%
14 7
 
< 0.1%
13 16
 
< 0.1%
12 9
 
< 0.1%
11 11
 
< 0.1%
10 17
 
< 0.1%
9 49474
48.6%
8 10616
 
10.4%
7 10393
 
10.2%

glucose_test_result
Categorical

IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.3 MiB
Not tested
96420 
Normal
 
2597
Probably diabetic
 
1485
Diabetic
 
1264

Length

Max length17
Median length10
Mean length9.9752275
Min length6

Characters and Unicode

Total characters1015139
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNot tested
2nd rowNormal
3rd rowNot tested
4th rowNot tested
5th rowNot tested

Common Values

ValueCountFrequency (%)
Not tested 96420
94.7%
Normal 2597
 
2.6%
Probably diabetic 1485
 
1.5%
Diabetic 1264
 
1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 96420
48.3%
tested 96420
48.3%
diabetic 2749
 
1.4%
normal 2597
 
1.3%
probably 1485
 
0.7%

Most occurring characters

ValueCountFrequency (%)
t 292009
28.8%
e 195589
19.3%
o 100502
 
9.9%
N 99017
 
9.8%
97905
 
9.6%
d 97905
 
9.6%
s 96420
 
9.5%
a 6831
 
0.7%
b 5719
 
0.6%
i 5498
 
0.5%
Other values (7) 17744
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 815468
80.3%
Uppercase Letter 101766
 
10.0%
Space Separator 97905
 
9.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 292009
35.8%
e 195589
24.0%
o 100502
 
12.3%
d 97905
 
12.0%
s 96420
 
11.8%
a 6831
 
0.8%
b 5719
 
0.7%
i 5498
 
0.7%
r 4082
 
0.5%
l 4082
 
0.5%
Other values (3) 6831
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
N 99017
97.3%
P 1485
 
1.5%
D 1264
 
1.2%
Space Separator
ValueCountFrequency (%)
97905
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 917234
90.4%
Common 97905
 
9.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 292009
31.8%
e 195589
21.3%
o 100502
 
11.0%
N 99017
 
10.8%
d 97905
 
10.7%
s 96420
 
10.5%
a 6831
 
0.7%
b 5719
 
0.6%
i 5498
 
0.6%
r 4082
 
0.4%
Other values (6) 13662
 
1.5%
Common
ValueCountFrequency (%)
97905
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1015139
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 292009
28.8%
e 195589
19.3%
o 100502
 
9.9%
N 99017
 
9.8%
97905
 
9.6%
d 97905
 
9.6%
s 96420
 
9.5%
a 6831
 
0.7%
b 5719
 
0.6%
i 5498
 
0.5%
Other values (7) 17744
 
1.7%

a1c_test_result
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size11.3 MiB
Not tested
84748 
Diabetic
12028 
Normal
 
4990

Length

Max length10
Median length10
Mean length9.5674783
Min length6

Characters and Unicode

Total characters973644
Distinct characters15
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNot tested
2nd rowNot tested
3rd rowNot tested
4th rowNot tested
5th rowNot tested

Common Values

ValueCountFrequency (%)
Not tested 84748
83.3%
Diabetic 12028
 
11.8%
Normal 4990
 
4.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 84748
45.4%
tested 84748
45.4%
diabetic 12028
 
6.4%
normal 4990
 
2.7%

Most occurring characters

ValueCountFrequency (%)
t 266272
27.3%
e 181524
18.6%
N 89738
 
9.2%
o 89738
 
9.2%
84748
 
8.7%
s 84748
 
8.7%
d 84748
 
8.7%
i 24056
 
2.5%
a 17018
 
1.7%
D 12028
 
1.2%
Other values (5) 39026
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 787130
80.8%
Uppercase Letter 101766
 
10.5%
Space Separator 84748
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 266272
33.8%
e 181524
23.1%
o 89738
 
11.4%
s 84748
 
10.8%
d 84748
 
10.8%
i 24056
 
3.1%
a 17018
 
2.2%
b 12028
 
1.5%
c 12028
 
1.5%
r 4990
 
0.6%
Other values (2) 9980
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
N 89738
88.2%
D 12028
 
11.8%
Space Separator
ValueCountFrequency (%)
84748
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 888896
91.3%
Common 84748
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 266272
30.0%
e 181524
20.4%
N 89738
 
10.1%
o 89738
 
10.1%
s 84748
 
9.5%
d 84748
 
9.5%
i 24056
 
2.7%
a 17018
 
1.9%
D 12028
 
1.4%
b 12028
 
1.4%
Other values (4) 26998
 
3.0%
Common
ValueCountFrequency (%)
84748
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 973644
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 266272
27.3%
e 181524
18.6%
N 89738
 
9.2%
o 89738
 
9.2%
84748
 
8.7%
s 84748
 
8.7%
d 84748
 
8.7%
i 24056
 
2.5%
a 17018
 
1.7%
D 12028
 
1.2%
Other values (5) 39026
 
4.0%

change_meds
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
54755 
1
47011 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 54755
53.8%
1 47011
46.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 54755
53.8%
1 47011
46.2%

Most occurring characters

ValueCountFrequency (%)
0 54755
53.8%
1 47011
46.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 54755
53.8%
1 47011
46.2%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 54755
53.8%
1 47011
46.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 54755
53.8%
1 47011
46.2%

diabetes_meds
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
1
78363 
0
23403 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
1 78363
77.0%
0 23403
 
23.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 78363
77.0%
0 23403
 
23.0%

Most occurring characters

ValueCountFrequency (%)
1 78363
77.0%
0 23403
 
23.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 78363
77.0%
0 23403
 
23.0%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 78363
77.0%
0 23403
 
23.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 78363
77.0%
0 23403
 
23.0%

glimepiride
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
96575 
1
 
5191

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 96575
94.9%
1 5191
 
5.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 96575
94.9%
1 5191
 
5.1%

Most occurring characters

ValueCountFrequency (%)
0 96575
94.9%
1 5191
 
5.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 96575
94.9%
1 5191
 
5.1%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 96575
94.9%
1 5191
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 96575
94.9%
1 5191
 
5.1%
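
The imbalance figure given for glimepiride in the Alerts section (70.9%) is consistent with one minus the Shannon entropy of the class frequencies, normalised by the base-2 logarithm of the number of classes. The sketch below assumes that definition, which this report does not state; `imbalance_score` is an illustrative name, not a ydata-profiling function.

```python
from math import log2


def imbalance_score(counts):
    """1 - normalised Shannon entropy: 0 = perfectly balanced, 1 = one class only."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * log2(p) for p in probs)
    return 1 - entropy / log2(len(counts))


# Counts copied from the value tables in this report.
glimepiride = imbalance_score([96575, 5191])
glipizide = imbalance_score([89080, 12686])

print(f"glimepiride: {glimepiride:.1%}")
print(f"glipizide:   {glipizide:.1%}")
```

Under this assumed definition glimepiride comes out at roughly 70.9%, while glipizide stays below 50%, which would fit its not carrying an IMBALANCE flag if the alert cut-off is 0.5 (also an assumption).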

glipizide
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
89080 
1
12686 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 89080
87.5%
1 12686
 
12.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 89080
87.5%
1 12686
 
12.5%

Most occurring characters

ValueCountFrequency (%)
0 89080
87.5%
1 12686
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 89080
87.5%
1 12686
 
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 89080
87.5%
1 12686
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 89080
87.5%
1 12686
 
12.5%

glyburide
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
91116 
1
10650 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 91116
89.5%
1 10650
 
10.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 91116
89.5%
1 10650
 
10.5%

Most occurring characters

ValueCountFrequency (%)
0 91116
89.5%
1 10650
 
10.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 91116
89.5%
1 10650
 
10.5%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 91116
89.5%
1 10650
 
10.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 91116
89.5%
1 10650
 
10.5%

insulin
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
1
54383 
0
47383 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
1 54383
53.4%
0 47383
46.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 54383
53.4%
0 47383
46.6%

Most occurring characters

ValueCountFrequency (%)
1 54383
53.4%
0 47383
46.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 54383
53.4%
0 47383
46.6%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 54383
53.4%
0 47383
46.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 54383
53.4%
0 47383
46.6%

pioglitazone
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
94438 
1
 
7328

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 94438
92.8%
1 7328
 
7.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 94438
92.8%
1 7328
 
7.2%

Most occurring characters

ValueCountFrequency (%)
0 94438
92.8%
1 7328
 
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 94438
92.8%
1 7328
 
7.2%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 94438
92.8%
1 7328
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 94438
92.8%
1 7328
 
7.2%

repaglinide
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
100227 
1
 
1539

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 100227
98.5%
1 1539
 
1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 100227
98.5%
1 1539
 
1.5%

Most occurring characters

ValueCountFrequency (%)
0 100227
98.5%
1 1539
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 100227
98.5%
1 1539
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 100227
98.5%
1 1539
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 100227
98.5%
1 1539
 
1.5%

rosiglitazone
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
95401 
1
 
6365

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 95401
93.7%
1 6365
 
6.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 95401
93.7%
1 6365
 
6.3%

Most occurring characters

ValueCountFrequency (%)
0 95401
93.7%
1 6365
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 95401
93.7%
1 6365
 
6.3%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 95401
93.7%
1 6365
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 95401
93.7%
1 6365
 
6.3%

metformin
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
81778 
1
19988 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 81778
80.4%
1 19988
 
19.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 81778
80.4%
1 19988
 
19.6%

Most occurring characters

ValueCountFrequency (%)
0 81778
80.4%
1 19988
 
19.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 81778
80.4%
1 19988
 
19.6%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 81778
80.4%
1 19988
 
19.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 81778
80.4%
1 19988
 
19.6%

no_meds
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
78363 
1
23403 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
0 78363
77.0%
1 23403
 
23.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 78363
77.0%
1 23403
 
23.0%

Most occurring characters

ValueCountFrequency (%)
0 78363
77.0%
1 23403
 
23.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 78363
77.0%
1 23403
 
23.0%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 78363
77.0%
1 23403
 
23.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 78363
77.0%
1 23403
 
23.0%

sum_visits
Categorical

HIGH CORRELATION 

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
55828 
1
19941 
2
10062 
4
10031 
3
5904 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 55828
54.9%
1 19941
 
19.6%
2 10062
 
9.9%
4 10031
 
9.9%
3 5904
 
5.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 55828
54.9%
1 19941
 
19.6%
2 10062
 
9.9%
4 10031
 
9.9%
3 5904
 
5.8%

Most occurring characters

ValueCountFrequency (%)
0 55828
54.9%
1 19941
 
19.6%
2 10062
 
9.9%
4 10031
 
9.9%
3 5904
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 55828
54.9%
1 19941
 
19.6%
2 10062
 
9.9%
4 10031
 
9.9%
3 5904
 
5.8%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 55828
54.9%
1 19941
 
19.6%
2 10062
 
9.9%
4 10031
 
9.9%
3 5904
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 55828
54.9%
1 19941
 
19.6%
2 10062
 
9.9%
4 10031
 
9.9%
3 5904
 
5.8%

is_diabetic
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size10.4 MiB
0
63518 
1
38248 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters101766
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 63518
62.4%
1 38248
37.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 63518
62.4%
1 38248
37.6%

Most occurring characters

ValueCountFrequency (%)
0 63518
62.4%
1 38248
37.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 101766
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 63518
62.4%
1 38248
37.6%

Most occurring scripts

ValueCountFrequency (%)
Common 101766
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 63518
62.4%
1 38248
37.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 63518
62.4%
1 38248
37.6%

age_sq
Real number (ℝ)

HIGH CORRELATION 

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean51.511065
Minimum0
Maximum100
Zeros158
Zeros (%)0.2%
Negative0
Negative (%)0.0%
Memory size5.6 MiB

Quantile statistics

Minimum0
5-th percentile9
Q136
median49
Q364
95-th percentile81
Maximum100
Range100
Interquartile range (IQR)28

Descriptive statistics

Standard deviation22.651429
Coefficient of variation (CV)0.4397391
Kurtosis-0.48076143
Mean51.511065
Median Absolute Deviation (MAD)15
Skewness-0.15732795
Sum5242075
Variance513.08724
MonotonicityNot monotonic
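
The mean, variance, standard deviation and coefficient of variation above can be cross-checked against the value counts reported for age_sq. The sketch below uses only the standard library. It assumes the variance is the sample variance (divisor n - 1), which this report does not state but which matches the reported 513.08724 to the precision shown; the variable names are illustrative.

```python
from math import sqrt

# value -> count, copied from the age_sq value-count table in this report
counts = {0: 158, 1: 2757, 4: 668, 9: 1609, 16: 3677, 25: 9407,
          36: 16780, 49: 21885, 64: 25334, 81: 16777, 100: 2714}

n = sum(counts.values())                         # number of observations
total = sum(v * c for v, c in counts.items())    # the "Sum" statistic
mean = total / n

# Sample variance (ddof = 1); assumed to be the report's convention.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean                                  # coefficient of variation

print(n, total)
print(round(mean, 6), round(variance, 5), round(std, 6), round(cv, 7))
```

The observation count and sum computed this way equal the reported 101766 and 5242075 exactly, which also confirms that the eleven listed values cover every row.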
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
64 25334
24.9%
49 21885
21.5%
36 16780
16.5%
81 16777
16.5%
25 9407
 
9.2%
16 3677
 
3.6%
1 2757
 
2.7%
100 2714
 
2.7%
9 1609
 
1.6%
4 668
 
0.7%
ValueCountFrequency (%)
0 158
 
0.2%
1 2757
 
2.7%
4 668
 
0.7%
9 1609
 
1.6%
16 3677
 
3.6%
25 9407
 
9.2%
36 16780
16.5%
49 21885
21.5%
64 25334
24.9%
81 16777
16.5%
ValueCountFrequency (%)
100 2714
 
2.7%
81 16777
16.5%
64 25334
24.9%
49 21885
21.5%
36 16780
16.5%
25 9407
 
9.2%
16 3677
 
3.6%
9 1609
 
1.6%
4 668
 
0.7%
1 2757
 
2.7%

Interactions

Pairwise interaction plots (169 panels, consistent with 13 × 13 numeric variables); images not preserved in this text.

Correlations
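
The High correlation alerts can be related to the matrix that follows by listing the off-diagonal coefficients at or above a cut-off. The sketch below assumes a cut-off of 0.5, which this report does not state, and copies a handful of coefficients from the matrix; `flag_high` is a hypothetical helper name.

```python
def flag_high(pairs, threshold=0.5):
    """Return the variable pairs whose absolute coefficient meets the threshold."""
    return sorted(p for p, r in pairs.items() if abs(r) >= threshold)


# A few coefficients copied from the correlation matrix in this report.
pairs = {
    ("age", "age_sq"): 1.000,
    ("admission_source", "admission_type"): 0.588,
    ("invisits", "sum_visits"): 0.589,
    ("length_stay", "num_meds"): 0.465,
    ("admission_source", "num_tests"): -0.226,
}

high = flag_high(pairs)
print(high)
```

With the assumed cut-off, the three flagged pairs are the ones that also appear in the Alerts section (age with age_sq, admission_source with admission_type, invisits with sum_visits), while length_stay with num_meds at 0.465 falls just short.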

age payer_code disposition admission_source length_stay num_tests non_lab_procedures num_meds diag_1 diag_2 diag_3 num_diag age_sq race invisits admission_type glucose_test_result a1c_test_result change_meds diabetes_meds glimepiride glipizide glyburide insulin pioglitazone repaglinide rosiglitazone metformin no_meds sum_visits is_diabetic
age 1.000 0.068 0.298 -0.035 0.115 0.025 -0.058 0.028 0.034 0.070 0.063 0.190 1.000 0.083 0.059 0.052 0.035 0.094 0.054 0.044 0.040 0.062 0.083 0.093 0.050 0.051 0.046 0.111 0.044 0.051 0.269
payer_code 0.068 1.000 0.002 -0.048 -0.033 -0.041 -0.069 0.026 0.010 0.043 0.034 0.108 0.068 0.053 0.021 0.103 0.048 0.038 0.077 0.046 0.026 0.011 0.027 0.067 0.009 0.030 0.010 0.044 0.046 0.026 0.042
disposition 0.298 0.002 1.000 -0.022 0.295 0.065 -0.009 0.175 0.046 0.063 0.065 0.177 0.298 0.038 0.052 0.085 0.073 0.039 0.040 0.033 0.021 0.028 0.056 0.077 0.031 0.026 0.026 0.068 0.033 0.051 0.135
admission_source -0.035 -0.048 -0.022 1.000 0.004 -0.226 0.225 0.077 0.022 0.004 -0.008 -0.160 -0.035 0.057 0.044 0.588 0.359 0.075 0.042 0.029 0.031 0.017 0.036 0.043 0.028 0.031 0.037 0.053 0.029 0.057 0.051
length_stay 0.115 -0.033 0.295 0.004 1.000 0.337 0.187 0.465 -0.044 0.074 0.071 0.237 0.115 0.014 0.045 0.028 0.029 0.048 0.115 0.070 0.020 0.021 0.027 0.104 0.005 0.036 0.016 0.020 0.070 0.027 0.127
num_tests 0.025 -0.041 0.065 -0.226 0.337 1.000 0.023 0.252 -0.078 0.018 0.017 0.169 0.025 0.041 0.027 0.197 0.243 0.182 0.070 0.043 0.021 0.015 0.014 0.098 0.017 0.032 0.009 0.056 0.043 0.017 0.071
non_lab_procedures -0.058 -0.069 -0.009 0.225 0.187 0.023 1.000 0.352 0.006 0.057 0.051 0.067 -0.058 0.024 0.041 0.157 0.045 0.046 0.027 0.030 0.013 0.014 0.014 0.032 0.012 0.000 0.014 0.040 0.030 0.047 0.081
num_meds 0.028 0.026 0.175 0.077 0.465 0.252 0.352 1.000 0.025 0.097 0.082 0.294 0.028 0.031 0.057 0.094 0.029 0.019 0.244 0.196 0.045 0.064 0.049 0.205 0.071 0.029 0.055 0.071 0.196 0.063 0.145
diag_1 0.034 0.010 0.046 0.022 -0.044 -0.078 0.006 0.025 1.000 0.005 0.009 0.012 0.034 0.043 0.056 0.164 0.045 0.096 0.074 0.060 0.028 0.061 0.068 0.138 0.050 0.013 0.040 0.120 0.060 0.049 0.343
diag_2 0.070 0.043 0.063 0.004 0.074 0.018 0.057 0.097 0.005 1.000 0.063 0.155 0.070 0.025 0.034 0.070 0.021 0.063 0.046 0.039 0.028 0.035 0.044 0.066 0.019 0.019 0.027 0.086 0.039 0.034 0.393
diag_3 0.063 0.034 0.065 -0.008 0.071 0.017 0.051 0.082 0.009 0.063 1.000 0.178 0.063 0.020 0.038 0.042 0.015 0.052 0.028 0.026 0.023 0.030 0.040 0.044 0.022 0.020 0.015 0.072 0.026 0.040 0.486
num_diag 0.190 0.108 0.177 -0.160 0.237 0.169 0.067 0.294 0.012 0.155 0.178 1.000 0.190 0.056 0.069 0.093 0.055 0.047 0.057 0.032 0.018 0.023 0.041 0.107 0.013 0.040 0.018 0.079 0.032 0.087 0.404
age_sq 1.000 0.068 0.298 -0.035 0.115 0.025 -0.058 0.028 0.034 0.070 0.063 0.190 1.000 0.081 0.046 0.050 0.034 0.082 0.050 0.039 0.033 0.056 0.077 0.073 0.043 0.051 0.042 0.107 0.039 0.040 0.247
race 0.083 0.053 0.038 0.057 0.014 0.041 0.024 0.031 0.043 0.025 0.020 0.056 0.081 1.000 0.047 0.074 0.040 0.030 0.020 0.013 0.023 0.022 0.029 0.052 0.027 0.027 0.012 0.024 0.013 0.046 0.077
invisits 0.059 0.021 0.052 0.044 0.045 0.027 0.041 0.057 0.056 0.034 0.038 0.069 0.046 0.047 1.000 0.032 0.018 0.057 0.025 0.033 0.014 0.022 0.039 0.071 0.023 0.020 0.022 0.075 0.033 0.589 0.056
admission_type 0.052 0.103 0.085 0.588 0.028 0.197 0.157 0.094 0.164 0.070 0.042 0.093 0.050 0.074 0.032 1.000 0.313 0.067 0.031 0.019 0.057 0.009 0.004 0.042 0.031 0.059 0.035 0.056 0.019 0.044 0.032
glucose_test_result 0.035 0.048 0.073 0.359 0.029 0.243 0.045 0.029 0.045 0.021 0.015 0.055 0.034 0.040 0.018 0.313 1.000 0.052 0.057 0.050 0.034 0.000 0.008 0.063 0.017 0.018 0.005 0.029 0.050 0.046 0.040
a1c_test_result 0.094 0.038 0.039 0.075 0.048 0.182 0.046 0.019 0.096 0.063 0.052 0.047 0.082 0.030 0.057 0.067 0.052 1.000 0.105 0.088 0.024 0.020 0.017 0.106 0.000 0.026 0.013 0.047 0.088 0.060 0.115
change_meds 0.054 0.077 0.040 0.042 0.115 0.070 0.027 0.244 0.074 0.046 0.028 0.057 0.050 0.020 0.025 0.031 0.057 0.105 1.000 0.506 0.139 0.198 0.178 0.514 0.202 0.075 0.196 0.324 0.506 0.049 0.059
diabetes_meds 0.044 0.046 0.033 0.029 0.070 0.043 0.030 0.196 0.060 0.039 0.026 0.032 0.039 0.013 0.033 0.019 0.050 0.088 0.506 1.000 0.127 0.206 0.187 0.585 0.152 0.068 0.141 0.270 1.000 0.049 0.057
glimepiride 0.040 0.026 0.021 0.031 0.020 0.021 0.013 0.045 0.028 0.028 0.023 0.018 0.033 0.023 0.014 0.057 0.034 0.024 0.139 0.127 1.000 0.074 0.071 0.006 0.044 0.005 0.039 0.041 0.127 0.017 0.012
glipizide 0.062 0.011 0.028 0.017 0.021 0.015 0.014 0.064 0.061 0.035 0.030 0.023 0.056 0.022 0.022 0.009 0.000 0.020 0.198 0.206 0.074 1.000 0.108 0.035 0.049 0.017 0.041 0.078 0.206 0.009 0.014
glyburide 0.083 0.027 0.056 0.036 0.027 0.014 0.014 0.049 0.068 0.044 0.040 0.041 0.077 0.029 0.039 0.004 0.008 0.017 0.178 0.187 0.071 0.108 1.000 0.079 0.025 0.025 0.036 0.140 0.187 0.039 0.009
insulin 0.093 0.067 0.077 0.043 0.104 0.098 0.032 0.205 0.138 0.066 0.044 0.107 0.073 0.052 0.071 0.042 0.063 0.106 0.514 0.585 0.006 0.035 0.079 1.000 0.004 0.007 0.003 0.033 0.585 0.065 0.067
pioglitazone 0.050 0.009 0.031 0.028 0.005 0.017 0.012 0.071 0.050 0.019 0.022 0.013 0.043 0.027 0.023 0.031 0.017 0.000 0.202 0.152 0.044 0.049 0.025 0.004 1.000 0.019 0.065 0.057 0.152 0.000 0.000
repaglinide 0.051 0.030 0.026 0.031 0.036 0.032 0.000 0.029 0.013 0.019 0.020 0.040 0.051 0.027 0.020 0.059 0.018 0.026 0.075 0.068 0.005 0.017 0.025 0.007 0.019 1.000 0.011 0.000 0.068 0.022 0.017
rosiglitazone 0.046 0.010 0.026 0.037 0.016 0.009 0.014 0.055 0.040 0.027 0.015 0.018 0.042 0.012 0.022 0.035 0.005 0.013 0.196 0.141 0.039 0.041 0.036 0.003 0.065 0.011 1.000 0.097 0.141 0.014 0.010
metformin 0.111 0.044 0.068 0.053 0.020 0.056 0.040 0.071 0.120 0.086 0.072 0.079 0.107 0.024 0.075 0.056 0.029 0.047 0.324 0.270 0.041 0.078 0.140 0.033 0.057 0.000 0.097 1.000 0.270 0.054 0.054
no_meds 0.044 0.046 0.033 0.029 0.070 0.043 0.030 0.196 0.060 0.039 0.026 0.032 0.039 0.013 0.033 0.019 0.050 0.088 0.506 1.000 0.127 0.206 0.187 0.585 0.152 0.068 0.141 0.270 1.000 0.049 0.057
sum_visits 0.051 0.026 0.051 0.057 0.027 0.017 0.047 0.063 0.049 0.034 0.040 0.087 0.040 0.046 0.589 0.044 0.046 0.060 0.049 0.049 0.017 0.009 0.039 0.065 0.000 0.022 0.014 0.054 0.049 1.000 0.055
is_diabetic 0.269 0.042 0.135 0.051 0.127 0.071 0.081 0.145 0.343 0.393 0.486 0.404 0.247 0.077 0.056 0.032 0.040 0.115 0.059 0.057 0.012 0.014 0.009 0.067 0.000 0.017 0.010 0.054 0.057 0.055 1.000
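
The report does not state which coefficients fill this matrix. One plausible reading, assuming the profiler's automatic setting, is a rank correlation (Spearman) for numeric pairs and an unsigned association measure (Cramér's V) where a categorical column is involved; this would be consistent with negative entries appearing only among numeric columns. The sketch below uses only the standard library and toy values (not rows from the dataset) to show why two entries in the matrix can reach exactly 1.000 under that assumption: a monotonic transform such as age_sq keeps the ranks of age, and a column fully determined by another gives a Cramér's V of 1. The function names are illustrative, not part of any library.

```python
import math
from collections import Counter

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def cramers_v(x, y):
    """Uncorrected Cramér's V from the chi-squared statistic."""
    n = len(x)
    joint = Counter(zip(x, y))
    row, col = Counter(x), Counter(y)
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy values chosen for illustration only.
age = [1, 2, 3, 5, 7, 8, 9, 10]
age_sq = [a * a for a in age]            # monotonic transform of age
diabetes_meds = [0, 1, 1, 0, 1, 1, 0, 1]
no_meds = [1 - d for d in diabetes_meds]  # assumed complement, for illustration

rho_age = spearman(age, age_sq)
v_meds = cramers_v(diabetes_meds, no_meds)
```

Both `rho_age` and `v_meds` come out at 1 (up to floating-point rounding). If the same holds for the real columns, one of each perfectly associated pair (age / age_sq, diabetes_meds / no_meds) carries no extra information for these measures.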

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
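
The per-column nullity that these plots summarize can be sketched as a simple count. The example below uses toy records, not rows from the dataset, and a hypothetical helper `missing_by_column`. The optional `sentinels` argument is an assumption: the report counts zero missing cells, but values such as -1 appear in some rows of the Duplicate rows table, and if those are placeholder codes they would not be counted as missing unless listed explicitly.

```python
def missing_by_column(records, sentinels=()):
    """Count entries per column that are None or one of the given sentinels."""
    counts = {}
    for record in records:
        for column, value in record.items():
            is_missing = value is None or value in sentinels
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts

# Toy records chosen for illustration only.
records = [
    {"payer_code": 7.0, "diag_2": 250},
    {"payer_code": -1.0, "diag_2": -1},
    {"payer_code": None, "diag_2": 428},
]

strict = missing_by_column(records)                             # None only
with_sentinels = missing_by_column(records, sentinels=(-1, -1.0))  # None or -1
```

Comparing `strict` with `with_sentinels` shows how much a "no missing values" result depends on how missingness is encoded.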

Sample

race age payer_code invisits admission_type disposition admission_source length_stay num_tests non_lab_procedures num_meds diag_1 diag_2 diag_3 num_diag glucose_test_result a1c_test_result change_meds diabetes_meds glimepiride glipizide glyburide insulin pioglitazone repaglinide rosiglitazone metformin no_meds sum_visits is_diabetic age_sq
encounter_id
100000Caucasian73.001113390421633Not testedNot tested000000000010149
100020Caucasian56.00313290148355NormalNot tested110001000000125
100022Caucasian615.0061124382157735Not testedNot tested110011000100136
100060Not Registered715.01611283911431265Not testedNot tested110001000101149
100076Caucasian107.001112440931088Not testedNot tested0000000000100100
100078Caucasian77.0061126370218737Not testedNot tested111001100000149
100080Not Registered56.00613330269475Not testedNot tested010001000000025
100087AfricanAmerican87.0011152380877105Not testedNot tested010010000000064
100096Caucasian77.0001123350103337Not testedNot tested111001000000149
100108Caucasian97.0032134170237889DiabeticNot tested110001000000081
race age payer_code invisits admission_type disposition admission_source length_stay num_tests non_lab_procedures num_meds diag_1 diag_2 diag_3 num_diag glucose_test_result a1c_test_result change_meds diabetes_meds glimepiride glipizide glyburide insulin pioglitazone repaglinide rosiglitazone metformin no_meds sum_visits is_diabetic age_sq
encounter_id
999897Caucasian77.00611243801881689Not testedNot tested010100000000049
999906Caucasian77.0111117097779Not testedNot tested010100000004049
999911Caucasian87.00611257087376Not testedNot tested110101000000064
999928Not Registered66.0001121213127354Not testedNot tested010000000100136
999940Caucasian107.000112121177879Not testedDiabetic0000000000100100
999944AfricanAmerican88.0001121233167734Not testedNot tested010001000000164
999953Caucasian87.00111216001016879Not testedDiabetic010000100000064
999966Caucasian97.0112112280208779Not testedNot tested010001000004081
999968Caucasian77.031115540298739Not testedNot tested110101000003049
999980Caucasian60.00111261079349Not testedNot tested010001000000036

Duplicate rows

Most frequently occurring

race age payer_code invisits admission_type disposition admission_source length_stay num_tests non_lab_procedures num_meds diag_1 diag_2 diag_3 num_diag glucose_test_result a1c_test_result change_meds diabetes_meds glimepiride glipizide glyburide insulin pioglitazone repaglinide rosiglitazone metformin no_meds sum_visits is_diabetic age_sq # duplicates
0AfricanAmerican2-1.00111351033-1-11Not testedDiabetic010001000000142
1Caucasian0-1.00111347053-1-11Not testedDiabetic010001000000102
2Caucasian58.0401121196177779Not testedNot tested1100010000040252
3Caucasian97.000112112117778Not testedNot tested0100010000000812
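
The "# duplicates" column above counts how often each repeated row occurs. A minimal sketch of that grouping, using toy rows rather than rows from the dataset and a hypothetical helper `duplicate_groups`, is shown below. It assumes rows are compared on their column values only; since encounter_id appears as the index in the Sample tables, it is presumably not part of the comparison.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Toy rows chosen for illustration only.
rows = [
    ("Caucasian", 9, 7.0),
    ("Caucasian", 5, 8.0),
    ("Caucasian", 9, 7.0),
    ("AfricanAmerican", 2, -1.0),
]

groups = duplicate_groups(rows)
```

With pandas, `DataFrame.duplicated(keep=False)` flags the same rows and `DataFrame.drop_duplicates()` removes the extra copies.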